All Questions
Tagged with hyperparameter, cross-validation
19 questions
1 vote
0 answers
28 views
Choosing the number of features via cross-validation
I have an algorithm that trains a binary predictive model for a specified number of features from the dataset (the features are all of the same type, but not all are important). Thus, the number of features ...
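One common way to frame this — a minimal sketch only, with synthetic data and an assumed SelectKBest/LogisticRegression pipeline standing in for the asker's algorithm — is to treat the number of features as just another hyperparameter tuned by cross-validation:

```python
# Minimal sketch (not the asker's code): treat the number of features k as a
# tunable hyperparameter inside a cross-validated grid search.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=500, n_features=30, n_informative=8, random_state=0)

pipe = Pipeline([
    ("select", SelectKBest(score_func=f_classif)),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Each candidate k is scored with the same CV splits; the chosen k is the one
# that maximizes the cross-validated score.
search = GridSearchCV(pipe, {"select__k": [2, 5, 10, 20, 30]}, cv=5, scoring="roc_auc")
search.fit(X, y)
print(search.best_params_, search.best_score_)
```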
1 vote
1 answer
131 views
Can I apply different hyper-parameters for different sliding time windows?
Question: Can I apply different hyper-parameters for different training sets? I can see the point of using shared parameters, but I cannot see the point of using shared hyper-parameters. The ...
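A sketch of the per-window alternative the question is weighing, with a made-up sliding-window setup (the window/horizon sizes, model, and grid below are placeholders): each training window gets its own time-ordered inner search, so each window may end up with its own hyperparameters.

```python
# Illustrative sketch only: tune hyperparameters separately inside each
# sliding training window instead of sharing one configuration across windows.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
y = X[:, 0] + rng.normal(scale=0.1, size=600)

window, horizon = 200, 50
for start in range(0, len(X) - window - horizon + 1, horizon):
    X_tr, y_tr = X[start:start + window], y[start:start + window]
    X_te, y_te = X[start + window:start + window + horizon], y[start + window:start + window + horizon]

    # Inner, time-ordered CV on this window's training slice only.
    search = GridSearchCV(
        GradientBoostingRegressor(random_state=0),
        {"learning_rate": [0.03, 0.1], "n_estimators": [100, 300]},
        cv=TimeSeriesSplit(n_splits=3),
    )
    search.fit(X_tr, y_tr)
    print(start, search.best_params_, search.score(X_te, y_te))
```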
1 vote
0 answers
284 views
Adaptive Resampling in Caret with Pre-specified Validation Set
I was wondering if this is the correct way to get adaptive resampling in caret working with a pre-specified validation set using index. I can get this to work using the 'cv' method in caret like so ...
1 vote
1 answer
42 views
Asynchronous Hyperparameter Optimization - Dependency between iterations
When using asynchronous hyperparameter optimization packages such as scikit-optimize or hyperopt with cross-validation (e.g., cv = 2 or 4) and setting the number of iterations to N (e.g., N = 100), ...
4 votes
2 answers
1k views
Shuffle the data before splitting into folds
I am running 4-fold cross-validation hyperparameter tuning using sklearn's 'cross_validate' and 'KFold' functions. Assuming that my training dataset is already shuffled, should I then, for each ...
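For reference, a minimal sketch of the setup described, using sklearn's KFold and cross_validate with placeholder data and model; note that shuffle=True shuffles the indices once before splitting, not between folds:

```python
# Minimal sketch, assuming sklearn's KFold and cross_validate as in the
# question; data and model are made up for illustration.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_validate

X, y = make_classification(n_samples=1000, random_state=0)

# shuffle=True shuffles the data once before it is split into folds; it does
# not re-shuffle between folds. If the training set is already shuffled,
# shuffle=False gives equivalent behaviour.
cv = KFold(n_splits=4, shuffle=True, random_state=42)
scores = cross_validate(RandomForestClassifier(random_state=0), X, y, cv=cv, scoring="accuracy")
print(scores["test_score"])
```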
4 votes
1 answer
833 views
ROC AUC score is much less than average cross validation score
Using the Lending Club dataset to find the probability of default. I am using the hyperopt library to fine-tune hyperparameters for an XGBClassifier and trying to maximize the ROC AUC score. I am also using ...
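A hedged sketch of this kind of setup — hyperopt tuning an XGBClassifier against the mean cross-validated ROC AUC — with synthetic data and a made-up search space standing in for the Lending Club specifics:

```python
# Illustrative sketch (not the asker's code): hyperopt minimizes the negative
# mean CV ROC AUC of an XGBClassifier over an assumed search space.
import numpy as np
from hyperopt import STATUS_OK, Trials, fmin, hp, tpe
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)

space = {
    "max_depth": hp.quniform("max_depth", 3, 10, 1),
    "learning_rate": hp.loguniform("learning_rate", np.log(0.01), np.log(0.3)),
    "n_estimators": hp.quniform("n_estimators", 100, 500, 50),
}

def objective(params):
    model = XGBClassifier(
        max_depth=int(params["max_depth"]),
        learning_rate=params["learning_rate"],
        n_estimators=int(params["n_estimators"]),
        eval_metric="logloss",
    )
    auc = cross_val_score(model, X, y, cv=4, scoring="roc_auc").mean()
    # hyperopt minimizes, so return the negative AUC as the loss.
    return {"loss": -auc, "status": STATUS_OK}

best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=25, trials=Trials())
print(best)
```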
1 vote
0 answers
352 views
Validation curve/RandomizedSearchCV: difference between train and test score
I've built an RF model for an imbalanced data set that, after feature selection, has an F1 score of 54.26%. I am now trying to do hyperparameter tuning using RandomizedSearchCV, after creating validation ...
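A minimal sketch of RandomizedSearchCV for an RF with F1 scoring, keeping train scores so the train/validation gap can be inspected; the data, parameter distributions, and class weights here are placeholders, not the asker's:

```python
# Sketch under assumed data: random search for a random forest, with
# return_train_score=True so the train/validation gap is visible.
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=1500, weights=[0.85, 0.15], random_state=0)

param_dist = {
    "n_estimators": randint(100, 600),
    "max_depth": randint(3, 20),
    "min_samples_leaf": randint(1, 10),
}

search = RandomizedSearchCV(
    RandomForestClassifier(class_weight="balanced", random_state=0),
    param_dist,
    n_iter=20,
    cv=5,
    scoring="f1",
    return_train_score=True,  # needed to compare train vs. validation scores
    random_state=0,
)
search.fit(X, y)
i = search.best_index_
print(search.cv_results_["mean_train_score"][i], search.cv_results_["mean_test_score"][i])
```

A large gap between the two printed scores is the usual sign of overfitting that a validation curve would also reveal.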
2 votes
2 answers
163 views
Hyperparameter optimization, ensembling instead of selecting with CV criteria
While burning CPUs performing a CV selection on a thin grid over some hyperparameter space, I am using the scikit-learn API, for which the end result is a single point in the hyperparameter space, ...
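The idea being asked about, sketched by hand since scikit-learn's search objects only expose a single best_estimator_: refit the top-k grid configurations and average their predicted probabilities (data, model, and k below are illustrative only):

```python
# Illustrative sketch of ensembling over the grid instead of selecting:
# average the predictions of the top-k configurations found by the search.
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

search = GridSearchCV(
    SVC(probability=True),
    {"C": [0.1, 1, 10, 100], "gamma": ["scale", 0.01, 0.1]},
    cv=5,
)
search.fit(X_tr, y_tr)

# Refit the k configurations with the highest mean CV score.
k = 3
order = np.argsort(search.cv_results_["mean_test_score"])[::-1][:k]
models = [clone(SVC(probability=True)).set_params(**search.cv_results_["params"][i]).fit(X_tr, y_tr)
          for i in order]

# Ensemble by averaging predicted probabilities over the top-k configurations.
avg_proba = np.mean([m.predict_proba(X_te) for m in models], axis=0)
print((avg_proba.argmax(axis=1) == y_te).mean())
```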
0 votes
1 answer
379 views
Difference between validation and prediction
As a follow-up to "Validate via predict() or via fit()?", I wonder about the difference between validation and prediction. To keep it simple, I will refer to train, ...
3 votes
1 answer
669 views
Hyperparameter tuning and cross validation
I have some confusion about the proper usage of cross-validation to tune hyperparameters and evaluate estimator performance and generalizability. As I understand it, this would be the process you would ...
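One common answer to this kind of question is nested cross-validation; here is a minimal sketch with assumed data and model, where the inner loop tunes the hyperparameters and the outer loop estimates the performance of the whole tuning procedure:

```python
# Minimal nested-CV sketch (assumed data and model, not the asker's setup).
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=800, random_state=0)

inner_cv = KFold(n_splits=3, shuffle=True, random_state=1)
outer_cv = KFold(n_splits=5, shuffle=True, random_state=2)

# Inner loop: hyperparameter search.
tuned = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": ["scale", 0.01]}, cv=inner_cv)

# Outer loop: estimate of how well the tuned procedure generalizes.
outer_scores = cross_val_score(tuned, X, y, cv=outer_cv)
print(outer_scores.mean(), outer_scores.std())

# Final model: rerun the search on all available training data.
tuned.fit(X, y)
print(tuned.best_params_)
```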
1 vote
2 answers
345 views
Is it a good idea to tune the number of folds for cross-validation when tuning hyperparameters of an RF?
I'm new to data science. I'm trying to get the best Random Forest model. Unfortunately, I'm not sure if my idea can produce a well-generalized model. 1) I have split the data into a TrainingSet (70%) and ...
10 votes
4 answers
7k views
Which comes first: tuning the parameters or selecting the model?
I've been reading about how we split our data into 3 parts; generally, we use the validation set to help us tune the parameters and the test set to get an unbiased estimate of how well our model ...
4 votes
1 answer
4k views
Hyperparameter tuning for stacked models
I'm reading the following Kaggle post to learn how to incorporate model stacking into ML models: http://blog.kaggle.com/2016/12/27/a-kagglers-guide-to-model-stacking-in-practice/. The structure ...
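A hedged sketch, not taken from the Kaggle post: with sklearn's StackingClassifier, base-learner and meta-learner hyperparameters can be tuned jointly in one cross-validated search via the nested name__param convention (the estimators, grid values, and data below are placeholders):

```python
# Illustrative sketch: tune base learners and the meta-learner of a stack
# together with a single cross-validated grid search.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, random_state=0)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)), ("svc", SVC(probability=True))],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,  # out-of-fold predictions are used to train the meta-learner
)

param_grid = {
    "rf__n_estimators": [200, 400],
    "svc__C": [1, 10],
    "final_estimator__C": [0.1, 1.0],
}

search = GridSearchCV(stack, param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```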
1 vote
0 answers
470 views
How to perform Platt scaling for a hyperparameter-optimized model?
I'm using Python and have a best estimator from a grid search. I wanted to be able to calibrate the probability output accordingly, but would like to know more about implementing Platt scaling. From ...
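For context, Platt scaling corresponds to sklearn's CalibratedClassifierCV with method="sigmoid"; a minimal sketch wrapping a grid search's best estimator follows, with synthetic data and an assumed SVC in place of the asker's model:

```python
# Minimal sketch: sigmoid (Platt) calibration of the best configuration
# found by a grid search, on assumed data and model.
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=5)
search.fit(X_tr, y_tr)

# Wrap the best configuration; cv=5 refits it on calibration folds and fits a
# sigmoid (Platt) mapping on the held-out fold predictions.
calibrated = CalibratedClassifierCV(search.best_estimator_, method="sigmoid", cv=5)
calibrated.fit(X_tr, y_tr)
print(calibrated.predict_proba(X_te)[:5])
```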
3 votes
1 answer
694 views
Hyperparameters and Validation Set
Please correct me if I am wrong. "The Training Set is used for calculating the parameters of a machine learning model, Validation data is used for calculating the hyperparameters of the same model (we use the same ...